ABSTRACT

The purpose of this paper is to present a
generalization of the concept of item difficulty to test items that
measure more than one dimension. Three common definitions of item
difficulty were considered: the proportion of correct responses for a
group of individuals; the probability of a correct response to an
item for a specific person; and the location of the item along a
difficulty continuum. This paper defines the difficulty of an item
that measures more than one dimension as the direction from the
origin of the multidimensional space to the point of greatest
discriminating power and the distance from the origin to that point.
The direction can be given in terms of angles with the coordinate
axes or the corresponding direction cosines. The distance is a signed
number using the same units as the coordinate axes. For the
unidimensional case, the definition simplifies to the b-parameter
from unidimensional item response theory. (BW)
The purpose of this paper is to present a generalization of the concept
of item difficulty to test items that measure more than one dimension. In
developing the generalization of item difficulty, three common definitions of
difficulty were considered. The first definition is the proportion of correct
responses for a group of individuals. This is the common p-value discussed in
many measurement books. This conception of item difficulty yields a result
that is specific to the group being used to determine the p-value. It is
descriptive of the interaction of the group of persons with the item and does
not tell how difficult the item is for any particular person.

A second definition of item difficulty is the probability of a correct
response to an item for a specific person. This indication of item difficulty
can be determined using an IRT model and an estimate of a person's ability.
Of course, for this estimate of item difficulty to be accurate, the IRT model
selected must be an accurate representation of the interaction of a person and
an item. Unlike the previous definition, this indication of item difficulty
is not group specific. The p-value for a specific group can be determined
from the probability of a correct response for each person by averaging over
persons.
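
As a minimal sketch of this averaging step, the Python fragment below assumes a two-parameter logistic item response function and two small sets of made-up ability values. The function names `prob_correct` and `group_p_value` and all numbers are illustrative assumptions, not quantities from the paper.

```python
import math

def prob_correct(theta, a, b):
    """Person-level probability of a correct response (assumed 2PL form)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def group_p_value(thetas, a, b):
    """Expected proportion correct for a group: mean of person probabilities."""
    return sum(prob_correct(t, a, b) for t in thetas) / len(thetas)

# Hypothetical ability values for two groups taking the same item.
low_group = [-1.5, -1.0, -0.5, 0.0]
high_group = [0.0, 0.5, 1.0, 1.5]

# Hypothetical item parameters.
a_item, b_item = 1.0, 0.0

p_low = group_p_value(low_group, a_item, b_item)
p_high = group_p_value(high_group, a_item, b_item)
print(f"p-value, lower-ability group:  {p_low:.3f}")
print(f"p-value, higher-ability group: {p_high:.3f}")
```

Under these assumptions the same item yields a p-value below .5 for the lower-ability group and above .5 for the higher-ability group, which is the sense in which the p-value is group specific while the person-level probability is not.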

The third definition of item difficulty is the location of the item along
a difficulty continuum. The first two definitions yield a value that can be
interpreted in this way, but they have the disadvantage of being specific to a
group or to a person and not being solely a characteristic of the test item.
In IRT, each item is assumed to have a difficulty parameter that is solely a
characteristic of the item and is independent of the persons taking the
item. The other two types of difficulty statistics can be derived from the
IRT model and information about the ability of persons in a group.

Since the first two conceptions of item difficulty can be used as
measures of difficulty on a continuum, and since the third definition can be
used to derive the statistics used for the previous two, the third definition
has been selected as the basis of the generalization of the difficulty concept
to more than one dimension. More specifically, the IRT notion of item
difficulty will be extended to handle more than one dimension.
The Item Response Theory Definition of Item Difficulty

For the unidimensional case, item difficulty is defined in IRT as the
point on the ability scale corresponding to the point of inflection of the
item characteristic curve (ICC). This point can be determined by taking the
second derivative of the item response function, setting it equal to zero,
and solving for the non-degenerate root. For example, the difficulty
parameter for the two-parameter logistic model, given by



$$P(x \mid \theta, a, b) = \frac{e^{a(\theta - b)}}{1 + e^{a(\theta - b)}} , \tag{1}$$



where θ is the ability parameter and a and b are item parameters, can be
determined by setting the second derivative with respect to the ability
parameter equal to zero,



$$\frac{\partial^2 P(x \mid \theta, a, b)}{\partial \theta^2} = a^2 P Q (1 - 2P) = 0 , \tag{2}$$



and then solving for its roots. In this case a = 0, P = 0, and Q = 1 - P = 0
are degenerate cases, and only P = .5 yields a meaningful solution. It can be
shown that P = .5 when θ = b. Thus, the difficulty of the item is indicated
by the point on the θ-scale equal to the b-parameter in the model. Therefore,
b is the difficulty parameter for this model.
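
The derivation above can be checked numerically. The sketch below assumes the two-parameter logistic form of Equation 1 and the closed-form second derivative of Equation 2; the parameter values and the function names are hypothetical and chosen only for illustration.

```python
import math

def p_2pl(theta, a, b):
    """Two-parameter logistic item response function (Equation 1)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def second_derivative(theta, a, b):
    """Closed form of Equation 2: a^2 * P * Q * (1 - 2P), with Q = 1 - P."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p) * (1.0 - 2.0 * p)

def numeric_second_derivative(theta, a, b, h=1e-4):
    """Central finite-difference approximation, used only as a cross-check."""
    return (p_2pl(theta + h, a, b) - 2.0 * p_2pl(theta, a, b)
            + p_2pl(theta - h, a, b)) / (h * h)

# Hypothetical item parameters.
a_demo, b_demo = 1.2, 0.7

for theta in (b_demo - 0.5, b_demo, b_demo + 0.5):
    print(f"theta = {theta:5.2f}  P = {p_2pl(theta, a_demo, b_demo):.3f}  "
          f"P'' = {second_derivative(theta, a_demo, b_demo):+.4f}")
```

The second derivative is positive below b, zero at θ = b where P = .5, and negative above b, so the ICC changes from convex to concave exactly at the b-parameter.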

For unidimensional IRT models, the point on the ability scale indicated
by the b-parameter is the point where the ICC is the steepest. For the
two-parameter logistic model, this also indicates the point on the ability
scale where the item is most informative.

In order to generalize the IRT definition of item difficulty to
multidimensional items, the form of the item response function must first be
determined. For the multidimensional case, the item response function gives
the probability of a correct response to an item for a person with a
particular vector of abilities. A number of different forms have been
proposed for this function (see Reckase & McKinley, 1983 for several
examples). In all cases the functions have been assumed to increase
monotonically for all combinations of dimensions. The surface defined by a
multidimensional item response function has been labelled an item response
surface (IRS).
This paper will use a specific IRS, a multidimensional extension of the
two-parameter logistic model (M2PL), to demonstrate the concepts being
developed. This model is given by



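$$P(x_{ij} \mid \boldsymbol{\theta}_j, \mathbf{a}_i, d_i) = \frac{e^{x_{ij}(d_i + \mathbf{a}_i' \boldsymbol{\theta}_j)}}{1 + e^{(d_i + \mathbf{a}_i' \boldsymbol{\theta}_j)}} , \tag{3}$$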



where P(x_ij | θ_j, a_i, d_i) is the probability of response x (0 or 1) to
Item i for Person j, θ_j is a vector of abilities for Person j, a_i is an item
parameter vector for Item i, and d_i is a scalar item parameter for Item i. The
roles of the a_i and d_i parameters for this model will be described later in
this paper. Two examples of an IRS defined by the M2PL model for the
two-dimensional case are given in Figure 1.
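
A minimal sketch of Equation 3 in Python follows. The function name `m2pl_prob` and the parameter values are assumptions introduced for illustration; only the functional form comes from the model above.

```python
import math

def m2pl_prob(theta, a, d, x=1):
    """P(x | theta, a, d) for the M2PL model of Equation 3.

    theta and a are sequences of equal length (ability and item parameter
    vectors), d is the scalar item parameter, and x is the response (0 or 1).
    """
    z = d + sum(a_k * t_k for a_k, t_k in zip(a, theta))
    return math.exp(x * z) / (1.0 + math.exp(z))

# Hypothetical two-dimensional person and item.
theta = [0.5, -0.25]
a = [1.0, 0.5]
d = -0.3

p_correct = m2pl_prob(theta, a, d, x=1)
p_incorrect = m2pl_prob(theta, a, d, x=0)
print(f"P(x = 1) = {p_correct:.3f}, P(x = 0) = {p_incorrect:.3f}")
```

Because x multiplies the exponent only in the numerator, the same expression returns the probability of a correct response for x = 1 and of an incorrect response for x = 0, and the two always sum to one.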
Definition of Multidimensional Difficulty



The definition of difficulty for multidimensional test items that is
proposed in this paper has three purposes: (a) it describes the
characteristics of the item so that it can be compared to other items; (b) it
gives an indication of the location of the item in the multidimensional
ability space; and (c) it tells where the item is most informative.

The definition of difficulty is developed as an extension of the
unidimensional IRT definition of difficulty. As given above, difficulty is
defined as the location of the point of inflection of the ICC on the ability
continuum. For the multidimensional case, difficulty will be defined as the
distance and direction of the point of steepest slope from the origin of the
multidimensional space. For the M2PL model given previously, this point is
the closest point to the origin in the locus of points of inflection for the
IRS. While most one-dimensional IRT models have only one point of inflection,
the multidimensional models have many. In fact, the locus of points of
inflection is usually a function of the ability vector in the multidimensional
space. The horizontal lines on the IRSs in Figure 1 show the locus of points
of inflection for these items.

In order to determine the location of a multidimensional item in the
multidimensional space, two steps must be performed. First, the locus of
points of inflection must be determined. Second, the distance of the locus of
points of inflection from the origin of the ability space must be
determined. The distance is taken from the origin to the closest point on the
locus of points of inflection. Thus, the multidimensional difficulty of a
test item has two components: (a) the distance of the locus of points of
inflection from the origin of the ability space, and (b) the direction from
the origin to the closest point.

Locus of Points of Inflection 

For the unidimensional IRT models, the point of inflection of the ICC is
determined by taking the second derivative of the item response function and
solving for its root. For the multidimensional case, the same procedure is
followed, but since the characteristics of the surface depend on the
direction, the second directional derivative is used instead of the simple
second derivative.

The second directional derivative gives the rate of change of the slope
of the surface in a particular direction. For a multidimensional IRT model,
the second directional derivative is given by



$$\nabla_{\phi}^2 P = \sum_{k=1}^{m} \sum_{l=1}^{m} \frac{\partial^2 P}{\partial \theta_k \, \partial \theta_l} \cos\phi_k \cos\phi_l , \tag{4}$$

where P is the item response function, φ is the vector of angles with the m
coordinate axes, and θ_1, ..., θ_m are the m ability dimensions.



For the model presented in Equation 3, the second directional derivative
simplifies to

$$\nabla_{\phi}^2 P = P Q (1 - 2P)(a_1 \cos\phi_1 + a_2 \cos\phi_2 + \cdots + a_m \cos\phi_m)^2 , \tag{5}$$

where all of the variables have been defined previously.

As with the two-parameter logistic model, the only non-degenerate solution
is the case when P = .5. When P = .5, the exponent of Equation 3 must be 0.
Therefore, the locus of points of inflection for this model is given by

$$d_i + \mathbf{a}_i' \boldsymbol{\theta}_j = 0 . \tag{6}$$

This is the equation for a hyperplane in an m-dimensional space.

The proposed indicators of difficulty for a multidimensional test item
are the shortest distance between the locus of points of inflection and the
origin of the space, and the direction used to obtain the shortest distance.
The shortest distance between the origin and the hyperplane of inflection is
along a perpendicular to the hyperplane. The direction cosines of a line that
is perpendicular to the hyperplane of inflection are given by

$$\cos\phi_{ji} = \frac{a_{ji}}{\sqrt{\sum_{k=1}^{m} a_{ki}^2}} , \tag{7}$$

where φ_ji is the angle between the line and the jth dimensional axis, and a_ji
is the jth term in the item parameter vector a_i.

The distance of the hyperplane of inflection from the origin for this
model is given by

$$D_i = \frac{-d_i}{\sqrt{\sum_{j=1}^{m} a_{ji}^2}} , \tag{8}$$

where d_i is the scalar parameter in the exponent of the model and a_ji has been
previously defined.
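
The two statistics can be sketched in a few lines of Python. The helper name `direction_and_distance` and the item parameters are hypothetical. The distance is computed as the signed distance from the origin to the hyperplane d + a'θ = 0, measured along the direction given by the direction cosines, which works out to -d divided by the length of the a-vector.

```python
import math

def direction_and_distance(a, d):
    """Direction cosines, angles in degrees, and signed distance for an item."""
    norm = math.sqrt(sum(a_k ** 2 for a_k in a))
    cosines = [a_k / norm for a_k in a]
    # Clamp to [-1, 1] to guard against rounding before taking acos.
    angles = [math.degrees(math.acos(max(-1.0, min(1.0, c)))) for c in cosines]
    distance = -d / norm
    return cosines, angles, distance

# Hypothetical two-dimensional item.
a_demo = [0.6, 0.8]
d_demo = -1.0

cosines, angles, distance = direction_and_distance(a_demo, d_demo)

# The point at that distance along the direction cosines should lie on the
# hyperplane of inflection, so the exponent of the model is zero there.
closest_point = [distance * c for c in cosines]
exponent = d_demo + sum(a_k * t_k for a_k, t_k in zip(a_demo, closest_point))

print("direction cosines:", [round(c, 3) for c in cosines])
print("angles (degrees): ", [round(ang, 2) for ang in angles])
print("signed distance:  ", round(distance, 3))
print("exponent at closest point:", round(exponent, 12))
```

Checking that the exponent vanishes at the computed point is a quick way to confirm that the direction and the distance together locate a point on the locus of inflection.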



The values given by Equations 7 and 8 have a fairly clear
interpretation. The angles defined by Equation 7 through the direction
cosines indicate the direction with respect to the ability dimensions in which
the item provides the most information. Since the direction specified is
perpendicular to the line of inflection, the slope of the IRS is steepest in
that direction. Thus the item is best at measuring a weighted composite of
abilities defined by

$$\text{Composite} = \sum_{j=1}^{m} \cos\phi_{ji} \, \theta_j . \tag{9}$$

If a set of items were selected that have the same direction cosines, the test
would operate as if it were unidimensional since all of the items discriminate
best for the same composite of abilities. If one of the direction cosines
were 1.0, the rest would be 0.0, and the item would be a pure measure of one
of the abilities.
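
The claim that the slope is steepest in the direction of Equation 7 can be illustrated numerically. For the M2PL model the partial derivative of P with respect to each ability dimension is a_k P Q, so at a fixed point the slope in a given direction is P Q times the sum of a_k cos φ_k. The sketch below scans directions in a two-dimensional space and compares the best one with the closed form. The function names are hypothetical, and the a-values are the Item 13 estimates printed in Table 1.

```python
import math

def directional_slope_factor(a, angle_deg):
    """Sum of a_k * cos(phi_k) for a two-dimensional item.

    angle_deg is the angle with the theta_1 axis (0 to 180 degrees); for such
    directions the cosine of the angle with the theta_2 axis is sin(angle).
    """
    phi = math.radians(angle_deg)
    return a[0] * math.cos(phi) + a[1] * math.sin(phi)

def steepest_angle(a, step=0.01):
    """Grid search for the direction with the largest slope factor."""
    n = int(round(180.0 / step))
    grid = [i * step for i in range(n + 1)]
    return max(grid, key=lambda ang: directional_slope_factor(a, ang))

a_item13 = [0.69, 0.41]  # a-parameter estimates for Item 13 in Table 1

scanned = steepest_angle(a_item13)
closed_form = math.degrees(math.acos(a_item13[0] / math.hypot(*a_item13)))

print(f"steepest direction by grid search: {scanned:.2f} degrees")
print(f"angle from Equation 7:             {closed_form:.2f} degrees")
```

Both values agree with the angle reported for Item 13 in Table 1 to within the grid spacing.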

The distance of the line of inflection from the origin can be interpreted
in much the same way as the b-parameter for unidimensional IRT models. For
two items with the same direction cosines, the item with the larger positive
D-value will have a smaller proportion of individuals obtaining a correct
response. Negative values of D generally indicate easy items, but only items
with the same direction cosines can be directly compared.

An Illustrative Example

In order to demonstrate the application of the concept of
multidimensional item difficulty, a 30-item multiple-choice examination with
two distinctly different kinds of items, punctuation and grammar, was analyzed
with the M2PL model using a program developed by McKinley & Reckase (1983).
The data on this test were obtained from the administration of the items to
1000 students at the University of Texas at Austin as part of an entrance
battery.

Table 1 gives the item parameter estimates for a two-dimensional solution
for the M2PL model as well as the direction relative to the θ_1-axis and the
distance to the point of inflection closest to the origin. Note that the
angles with the θ_1-axis for the first 15 items tend to cluster near
0° (mean = 11.09°), indicating that these items are best at measuring
the θ_1 ability. Since these items are all punctuation items, θ_1 can be
labelled as a punctuation dimension. The angles with the θ_1-axis of the
second 15 items tend to cluster around 90° (mean = 77.15°). This implies that
these items are best at measuring θ_2. Since these items are all grammar
items, θ_2 can be labelled as a grammar dimension.
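
The two cluster means quoted above can be reproduced from the angle column of Table 1. The following sketch simply averages the 30 reported angles in two blocks of 15; the variable names are illustrative.

```python
# Angles with the theta_1 axis (degrees) for Items 1 to 30, as reported in Table 1.
angles = [
    4.83, 8.37, 19.52, 12.14, 14.44, 11.59, 12.22, 7.68, 14.66, 9.09,
    7.19, 4.09, 30.72, 9.40, 0.36,
    62.16, 68.78, 64.00, 87.99, 113.33, 82.20, 83.42, 60.59, 78.52, 113.09,
    55.49, 71.10, 66.95, 85.00, 64.70,
]

first_15 = angles[:15]    # punctuation items
second_15 = angles[15:]   # grammar items

mean_first = sum(first_15) / len(first_15)
mean_second = sum(second_15) / len(second_15)

print(f"mean angle, Items 1-15:  {mean_first:.2f}")
print(f"mean angle, Items 16-30: {mean_second:.2f}")
```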
Table 1

Item Parameter Estimates and
Difficulty Statistics for 30 Punctuation
and Grammar Items

         Item Parameter Estimates      Directional Estimates
Item        d        a1       a2       Angle with θ_1-axis       D
  1         --       --       --              4.83              2.22
  2         --       --       --              8.37              1.05
  3         --       --       --             19.52              0.79
  4         --       --       --             12.14              0.23
  5         --       --       --             14.44              1.42
  6         --       --       --             11.59              1.76
  7         --       --       --             12.22              1.60
  8         --       --       --              7.68              1.42
  9         --       --       --             14.66              0.64
 10         --       --       --              9.09              2.02
 11         --       --       --              7.19              0.72
 12        3.22     1.1?      --              4.09              2.87
 13       -0.94     0.69     0.41            30.72             -1.17
 14        2.27     1.39     0.23             9.40              1.61
 15        2.04     1.57     0.01             0.36              1.30
 16        1.12     0.47     0.89            62.16              1.11
 17       -2.68     0.33     0.85            68.78             -2.94
 18       -0.06     0.20     0.41            64.00             -0.13
 19        1.44     0.02     0.57            87.99              2.52
 20        0.34    -0.22     0.51           113.33              0.61
 21        1.58     0.10     0.73            82.20              2.14
 22       -0.30     0.09     0.78            83.42             -0.38
 23        0.50     0.31     0.55            60.59              0.79
 24       -0.85     0.26     1.28            78.52             -0.65
 25       -0.30    -0.26     0.61           113.09             -0.45
 26        1.04     0.22     0.32            55.49              2.68
 27       -0.79     0.25     0.73            71.10             -1.02
 28        0.40     0.20     0.47            66.95              0.78
 29       -1.66     0.07     0.80            85.00             -2.07
 30        1.11     0.26     0.55            64.70              1.82



The D-values in Table 1 indicate the distance of the nearest point of
inflection to the origin of the θ-space. If two items measure effectively in
the same direction from the origin, the D-values indicate the relative
difficulty of the items. For example, Items 20 and 25 are measuring
approximately the same combination of abilities. Since the D-value for Item 20
is .61 and the D-value for Item 25 is -.45, Item 20 is estimated to be more
difficult than Item 25 on the particular composite measured by the items. When
the direction of the item is taken into account, the D-values can be interpreted
much like the b-parameter estimates from unidimensional item response theory.
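
The statement that Items 20 and 25 measure approximately the same combination of abilities can be checked from their a-parameter estimates in Table 1. The sketch below computes each item's angle with the θ_1-axis and the cosine of the angle between the two direction vectors; the function and variable names are hypothetical.

```python
import math

def angle_with_theta1(a):
    """Angle in degrees between the item's direction and the theta_1 axis."""
    return math.degrees(math.acos(a[0] / math.hypot(a[0], a[1])))

# a-parameter estimates (a1, a2) as printed in Table 1.
a_item20 = (-0.22, 0.51)
a_item25 = (-0.26, 0.61)

angle20 = angle_with_theta1(a_item20)
angle25 = angle_with_theta1(a_item25)

# Cosine of the angle between the two item directions (1.0 means identical).
dot = a_item20[0] * a_item25[0] + a_item20[1] * a_item25[1]
cos_between = dot / (math.hypot(*a_item20) * math.hypot(*a_item25))

print(f"Item 20 angle: {angle20:.2f} degrees")
print(f"Item 25 angle: {angle25:.2f} degrees")
print(f"cosine between directions: {cos_between:.5f}")
```

The two angles differ by about a quarter of a degree, which is why the D-values of these two items can be compared directly.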



Summary and Conclusions

A definition of item difficulty has been suggested for use with items that
require ability on more than one dimension for a correct response. This
definition has two components: (a) the direction from the origin of the
multidimensional space for which the item provides the most information; and (b)
the distance from the origin of the space to the point of steepest slope on the
IRS. This definition was demonstrated for the multidimensional extension of the
two-parameter logistic model.

The statistics provided by this definition can be directly applied to the
process of test construction. If a test that measures the abilities that define
the latent space is desired, items should be selected that have directions that
parallel the axes of the space. However, if it is merely desirable to construct
a test that operates as if it were unidimensional, then items that have the same
direction should be selected. These items measure the same composite of
abilities.

The D-statistic provided by the definition gives the distance from the
origin to the nearest point of inflection. This value can be interpreted in the
same way as the b-parameter from a unidimensional IRT model when items measure
in the same direction.



